skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Niu, Luyao"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Large language models (LLMs) are expected to follow in- structions from users and engage in conversations. Tech- niques to enhance LLMs’ instruction-following capabilities typically fine-tune them using data structured according to a predefined chat template. Although chat templates are shown to be effective in optimizing LLM performance, their impact on safety alignment of LLMs has been less understood, which is crucial for deploying LLMs safely at scale. In this paper, we investigate how chat templates affect safety alignment of LLMs. We identify a common vulnerability, named ChatBug, that is introduced by chat templates. Our key insight to identify ChatBug is that the chat templates provide a rigid format that need to be followed by LLMs, but not by users. Hence, a malicious user may not necessar- ily follow the chat template when prompting LLMs. Instead, malicious users could leverage their knowledge of the chat template and accordingly craft their prompts to bypass safety alignments of LLMs. We study two attacks to exploit the ChatBug vulnerability. Additionally, we demonstrate that the success of multiple existing attacks can be attributed to the ChatBug vulnerability. We show that a malicious user can exploit the ChatBug vulnerability of eight state-of-the- art (SOTA) LLMs and effectively elicit unintended responses from these models. Moreover, we show that ChatBug can be exploited by existing jailbreak attacks to enhance their at- tack success rates. We investigate potential countermeasures to ChatBug. Our results show that while adversarial train- ing effectively mitigates the ChatBug vulnerability, the vic- tim model incurs significant performance degradation. These results highlight the trade-off between safety alignment and helpfulness. Developing new methods for instruction tuning to balance this trade-off is an open and critical direction for future research. 
    more » « less
    Free, publicly-accessible full text available February 25, 2026
  2. Cyber-physical systems (CPS) are required to satisfy safety constraints in various application domains such as robotics, industrial manufacturing systems, and power systems. Faults and cyber attacks have been shown to cause safety violations, which can damage the system and endanger human lives. Resilient architectures have been proposed to ensure safety of CPS under such faults and attacks via methodologies including redundancy and restarting from safe operating conditions. The existing resilient architectures for CPS utilize different mechanisms to guarantee safety, and currently, there is no common framework to compare them. Moreover, the analysis and design undertaken for CPS employing one architecture is not readily extendable to another. In this article, we propose a timing-based framework for CPS employing various resilient architectures and develop a common methodology for safety analysis and computation of control policies and design parameters. Using the insight that the cyber subsystem operates in one out of a finite number of statuses, we first develop a hybrid system model that captures CPS adopting any of these architectures. Based on the hybrid system, we formulate the problem of joint computation of control policies and associated timing parameters for CPS to satisfy a given safety constraint and derive sufficient conditions for the solution. Utilizing the derived conditions, we provide an algorithm to compute control policies and timing parameters relevant to the employed architecture. We also note that our solution can be applied to a wide class of CPS with polynomial dynamics and also allows incorporation of new architectures. We verify our proposed framework by performing a case study on adaptive cruise control of vehicles. 
    more » « less
  3. This paper studies the synthesis of controllers for cyber-physical systems (CPSs) that are required to carry out complex time-sensitive tasks in the presence of an adversary. The time-sensitive task is specified as a formula in the metric interval temporal logic (MITL). CPSs that operate in adversarial environments have typically been abstracted as stochastic games (SGs); however, because traditional SG models do not incorporate a notion of time, they cannot be used in a setting where the objective is time-sensitive. To address this, we introduce durational stochastic games (DSGs). DSGs generalize SGs to incorporate a notion of time and model the adversary’s abilities to tamper with the control input (actuator attack) and manipulate the timing information that is perceived by the CPS (timing attack). We define notions of spatial, temporal, and spatio-temporal robustness to quantify the amounts by which system trajectories under the synthesized policy can be perturbed in space and time without affecting satisfaction of the MITL objective. In the case of an actuator attack, we design computational procedures to synthesize controllers that will satisfy the MITL task along with a guarantee of its robustness. In the presence of a timing attack, we relax the robustness constraint to develop a value iteration-based procedure to compute the CPS policy as a finite-state controller to maximize the probability of satisfying the MITL task. A numerical evaluation of our approach is presented on a signalized traffic network to illustrate our results. 
    more » « less